SUMO Substrates and Sites Prediction Combining Pattern Recognition and Phylogenetic Conservation
Authors
Abstract
Motivation: Small ubiquitin-related modifier (SUMO) proteins are widely expressed in eukaryotic cells and are reversibly coupled to their substrates through motif recognition, a process called sumoylation. Two questions arise: (1) how many potential SUMO substrates do mammalian proteomes, such as those of human and mouse, contain, and (2) given a SUMO substrate, can its sumoylation sites be identified? Previous SUMO substrate prediction systems mainly relied on pattern recognition, which achieves high sensitivity but yields many potential false positives. We therefore use phylogenetic conservation between mouse and human to reduce the number of potential false positives.
Results: We first use two major features of a potential SUMO substrate: a nuclear localization signal (NLS) and the consensus motif ψ-K-X-E, where ψ is a hydrophobic amino acid. We predict SUMO substrates with a simple rule: the subcellular localization of the protein must be predicted as nuclear by PSORT II, and the protein must contain at least one consensus motif ψ(A, F, I, L, M, V, W)-K-X-E. This rule alone still leaves too many predicted positives. To eliminate potential false positives, we use orthology information between mouse and human and require that at least one consensus motif occupy the same position in both sequences after alignment of the ortholog pair. This yields 2,683 potential SUMO substrates shared by mouse and human, and 58 of the 79 known human SUMO substrates are predicted correctly. For sumoylation site prediction, our method achieves nearly the same sensitivity as the existing tool SUMOplot: 42 true positives out of 54 known sumoylation sites, against 42 (high probability) or 44 (all) for SUMOplot. Our method substantially outperforms SUMOplot in specificity, predicting 74 sumoylation sites in total against 152 (high probability) or 324 (all) for SUMOplot. It thus greatly reduces the number of potential false positives while retaining satisfactory sensitivity.
Availability: The software SSP (SUMO Substrate and Site Prediction) 1.0 for Windows, written in Delphi, is available from http://973-proteinweb.ustc.edu.cn/sumo/.
Contact: [email protected], [email protected]
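The prediction rule described in the abstract can be illustrated with a minimal Python sketch. It is not the SSP implementation (which the abstract says is written in Delphi). All function names here are hypothetical. The nuclear-localization call is passed in as a plain boolean that the caller is assumed to have obtained from PSORT II separately. "Same position after sequence alignment" is interpreted here as the motif lysine falling in the same alignment column in both orthologs, which is an assumption about the authors' rule.

```python
import re

# Hydrophobic residues allowed at the psi position, as listed in the abstract.
PSI = "AFILMVW"
# psi-K-X-E wrapped in a lookahead so that overlapping motifs are all reported.
MOTIF = re.compile(rf"(?=([{PSI}]K[A-Z]E))")


def find_motif_lysines(seq):
    """Return 0-based indices of the lysine in every psi-K-X-E match (ungapped sequence)."""
    return [m.start() + 1 for m in MOTIF.finditer(seq.upper())]


def residue_columns(aligned):
    """Map each ungapped residue index to its column in the gapped alignment row."""
    return [col for col, ch in enumerate(aligned) if ch != "-"]


def conserved_motif_columns(aligned_a, aligned_b):
    """Alignment columns where both sequences carry a motif lysine (assumed conservation rule)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    cols_a = residue_columns(aligned_a)
    cols_b = residue_columns(aligned_b)
    hits_a = {cols_a[i] for i in find_motif_lysines(aligned_a.replace("-", ""))}
    hits_b = {cols_b[i] for i in find_motif_lysines(aligned_b.replace("-", ""))}
    return sorted(hits_a & hits_b)


def is_candidate_substrate(aligned_query, aligned_ortholog, predicted_nuclear):
    """Apply both filters: nuclear prediction (supplied by caller) and a conserved motif."""
    return predicted_nuclear and bool(
        conserved_motif_columns(aligned_query, aligned_ortholog)
    )


# Toy aligned pair (made-up sequences, not real proteins).
human = "MALKDEPP-LKAE"
mouse = "MALKDEPPSLKGD"
print(find_motif_lysines(human.replace("-", "")))
print(conserved_motif_columns(human, mouse))
print(is_candidate_substrate(human, mouse, predicted_nuclear=True))
```

In this toy pair the first sequence has two ψ-K-X-E motifs but only one is also present at the same alignment column in the second sequence, so only that site survives the conservation filter. This mirrors how the orthology step trims the pattern-only predictions.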
Similar resources
A genome-wide analysis of sumoylation-related biological processes and functions in human nucleus.
Protein sumoylation is an important reversible post-translational modification of proteins in the nucleus, and it orchestrates a variety of the cellular processes. Genome-wide analysis of functional abundance and distribution of Small Ubiquitin-related MOdifier (SUMO) substrates may shed a light on how sumoylation is involved in nuclear biological processes and functions. Two interesting questi...
Complementing computationally predicted regulatory sites in Tractor_DB using a pattern matching approach
Prokaryotic genomes annotation has focused on genes location and function. The lack of regulatory information has limited the knowledge on cellular transcriptional regulatory networks. However, as more phylogenetically close genomes are sequenced and annotated, the implementation of phylogenetic footprinting strategies for the recognition of regulators and their regulons becomes more important....
GPS-SUMO: a tool for the prediction of sumoylation sites and SUMO-interaction motifs
Small ubiquitin-like modifiers (SUMOs) regulate a variety of cellular processes through two distinct mechanisms, including covalent sumoylation and non-covalent SUMO interaction. The complexity of SUMO regulations has greatly hampered the large-scale identification of SUMO substrates or interaction partners on a proteome-wide level. In this work, we developed a new tool called GPS-SUMO for the ...
SUMOhunt: Combining Spatial Staging between Lysine and SUMO with Random Forests to Predict SUMOylation
Modification with SUMO protein has many key roles in eukaryotic systems which renders the identification of its target proteins and sites of considerable importance. Information regarding the SUMOylation of a protein may tell us about its subcellular localization, function, and spatial orientation. This modification occurs at particular and not all lysine residues in a given protein. In competi...
Predicting Protein-Protein Interactions from Protein Sequences Using Phylogenetic Profiles
In this study, a high accuracy protein-protein interaction prediction method is developed. The importance of the proposed method is that it only uses sequence information of proteins while predicting interaction. The method extracts phylogenetic profiles of proteins by using their sequence information. Combining the phylogenetic profiles of two proteins by checking existence of homologs in diff...
Publication date: 2004